Effect of the Front-End Processing on Speaker Verification Performance Using PCA and Scores Level Fusion
Authors
Abstract
This paper evaluates the impact of low-level features on speaker verification performance, with an emphasis on a recently proposed MFCC variant based on asymmetric tapers (hereafter MFCC-asymmetric). These features are used either standalone or followed by PCA as a linear projection applied before the GMM-UBM back-end classifier, in both clean and noisy environments. The performance of the MFCC-asymmetric features is compared with that of standard Mel-Frequency Cepstral Coefficients (MFCC) extracted from the TIMIT corpus under clean and noisy conditions. A score-level fusion framework, based on simple linear methods such as min, max, and sum and on trained methods such as SVM, is proposed to improve performance and mitigate noise degradation. Results on the noise-corrupted TIMIT database confirm that the fused system outperforms each individual system in noisy environments, and show that the performance of the PCA-based systems degrades drastically in the presence of environmental noise.
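The simple linear fusion rules named in the abstract (min, max, sum) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the z-score normalization step, and the example scores are all assumptions made for illustration, and the trained SVM fusion is not covered.

```python
import numpy as np


def znorm(scores):
    """Shift and scale one system's scores to zero mean and unit variance.

    Assumption: some normalization is needed so that systems with
    different score ranges can be combined; the abstract does not say which.
    """
    scores = np.asarray(scores, dtype=float)
    centered = scores - scores.mean()
    std = scores.std()
    return centered / std if std > 0 else centered


def fuse_scores(score_matrix, rule="sum"):
    """Fuse per-trial scores from several systems with a fixed rule.

    score_matrix: sequence of shape (n_systems, n_trials).
    rule: one of "min", "max", "sum".
    Returns an array of shape (n_trials,).
    """
    rules = {"min": np.min, "max": np.max, "sum": np.sum}
    if rule not in rules:
        raise ValueError(f"unknown fusion rule: {rule!r}")
    normalized = np.vstack([znorm(row) for row in score_matrix])
    return rules[rule](normalized, axis=0)


# Hypothetical verification scores for four trials from two systems
# (for example, one MFCC-based and one MFCC-asymmetric-based).
mfcc_scores = [2.0, -1.0, 0.5, -2.5]
asym_scores = [1.5, -0.5, 1.0, -2.0]

fused = fuse_scores([mfcc_scores, asym_scores], rule="sum")
print(fused)
```

The fused score for each trial would then be compared against a decision threshold in the same way as a single system's score.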
Similar Resources
Nonlinear Auditory Modeling as a Basis for Speaker Recognition
In this report, we develop a front-end nonlinear auditory model based on recent work of Dau, Puschel, and Kohlrausch (DPK) [Dau, Puschel, and Kohlrausch, 1997]. An important aspect of the model is the robust accentuation of temporal change in a signal at the cochlea level that forms the basis of a feature set for automatic speaker recognition. Preliminary speaker recognition experiments with th...
Robust Support Vector Machines for Speaker Verification Task
An important step in speaker verification is extracting features that best characterize the speaker's voice. This paper investigates a front-end processing approach that aims to improve the performance of speaker verification based on the SVM classifier, in text-independent mode. This approach combines features based on conventional Mel-cepstral Coefficients (MFCCs) and Line Spectral Frequencies (LSFs)...
Fusion of Cross Stream Information in Speaker Verification
This paper addresses the performance of various statistical data fusion techniques for combining the complementary score information in speaker verification. The complementary verification scores are based on the static and delta cepstral features. Both LPCC (Linear prediction-based cepstral coefficients) and MFCC (mel-frequency cepstral coefficients) are considered in the study. The experiment...
Using Exciting and Spectral Envelope Information and Matrix Quantization for Improvement of the Speaker Verification Systems
Speaker verification from a few spoken words or sentences has many applications. Many methods, such as DTW, HMM, VQ, and MQ, can be used for speaker verification. We applied MQ for its precise, reliable, and robust performance and its computational simplicity. We also used pitch frequency and log-gain contour to further improve system performance.
Maximum Likelihood i-vector Space Using PCA for Speaker Verification
This paper proposes a new approach to training the i-vector space using a variant of PCA with the Baum-Welch statistics for speaker verification. In the eigenvoice model, the rank of the variability space is bounded by the number of training speakers, so a variant of the probabilistic PCA approach is introduced for estimating the parameters. But this constraint doesn’t exist in the i-vector model because the numb...